Visual Speech Synthesis With Concatenative Speech

Authors

  • Asa Hallgren
  • Bertil Lyberg
Abstract

Today, synthetic speech is often based on the concatenation of natural speech: units such as diphones or polyphones are taken from natural speech and put together to form any word or sentence [5]. So far there have been two main ways of adding a visual modality to such synthesis: morphing between single images and concatenating video sequences. This study presents a new method in which recorded natural movements of points on the face are used to control an animated face.

Similar resources

Stages and method of preparing syllable and diphone speech databases for a Persian text-to-speech system

Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and examines how they were developed, their specifications, and their relative advantages. ...

Comprehensive many-to-many phoneme-to-viseme mapping and its application for concatenative visual speech synthesis

The use of visemes as atomic speech units in visual speech analysis and synthesis systems is well-established. Viseme labels are determined using a many-to-one phoneme-to-viseme mapping. However, due to the visual coarticulation effects, an accurate mapping from phonemes to visemes should define a many-to-many mapping scheme. In this research it was found that neither the use of standardized no...

FSM and k-nearest-neighbor for corpus based video-realistic audio-visual synthesis

In this paper we introduce a corpus-based 2D video-realistic audio-visual synthesis system. The system combines a concatenative Text-to-Speech (TTS) System with a concatenative Text-to-Visual (TTV) System to form a Text-to-Audio-Visual-Speech System (TTAVS) with synchronized audio and lip movement. For the concatenative TTS we are using a Finite State Machine approach to select non-uniform variable-size audio ...

An auditory-based distortion measure with application to concatenative speech synthesis

This study presents a new auditory-based distance measure with application to concatenative speech synthesis. This measure employs the Carney auditory model to produce a feature vector related to auditory perception. For concatenative synthesis, the new measure is employed to assess perceived discontinuities at segment transitions. Evaluations using a restricted data base environment show that ...

Generating emotional speech with a concatenative synthesizer

We describe an attempt to synthesize emotional speech with a concatenative speech synthesizer using a parameter space covering not only f0, duration and amplitude, but also voice quality parameters, spectral energy distribution, harmonics-to-noise ratio, and articulatory precision. The application of this extended parameter set offers the possibility to combine the high segmental quality of c...

Synchronization of speech frames based on phase data with application to concatenative speech synthesis

Synchronization of speech frames is an important issue in a concatenative speech synthesis system. In signal-processing terms, this translates into removing linear phase mismatches between concatenated speech frames. This paper presents two novel approaches to the problem of synchronizing speech frames, with an application to concatenative speech synthesis. Both methods are based on a pr...

Publication date: 1998